Language Information in Structured Documents: A Model for Mark-up and Rendering
Abstract
In this paper we discuss the structure and processing of multi-lingual documents, both at a general level and in relation to a proposed extension to the (no longer so new) standard LaTeX. Both in general and in the particular case of this proposal, our work would be impossible without the enormous support, both practical and moral, we get from our fellow members of the LaTeX3 project team (who maintain and enhance LaTeX) and from people all over the world who contribute to the development of LaTeX with their suggestions and comments.
Similar resources
Possibilities and Constraints for Managing and Reusing Information Content of Structured Documents: The Case of Operation and Maintenance Manuals
With the overwhelming amount of digital information produced nowadays it is necessary that we promote the importance of information content reuse. A great deal of explicit knowledge and other important company information is stored as documents. Structured documents, i.e. SGML and XML documents, contain mark-up that can be used as metainformation about document content. The markup in structured...
The integration of information retrieval techniques within a software reuse environment
This paper describes the development of an information retrieval (IR) model for the indexing, storage and retrieval of documents created in extensible mark-up language (XML). The application area is the software reuse environment, which involves a broader class of documents than can be processed by conventional IR systems. This includes design and analysis documents in unified modelling languag...
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...
GDOC: A System for Storage and Authoring of Documents through WEB Browsers
The paper addresses the problem of the storage of semi-structured information (as documents are) in a relational database, as well as the problem of authoring and retrieving documents over an Intranet. The proposed solution combines two technologies, relational database systems and a text mark-up language (HTML) to store documents and uses WEB browsers and Java to access documents. The paper pr...
Statistical Language Models for Intelligent XML Retrieval
The XML standards that are currently emerging have a number of characteristics that can also be found in database management systems, like schemas (DTDs and XML schema) and query languages (XPath and XQuery). Following this line of reasoning, an XML database might resemble traditional database systems. However, XML is more than a language to mark up data; it is also a language to mark up textua...
Publication date: 1998